-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use PhysicalExtensionCodec consistently #10075
Conversation
73c1310
to
a6993eb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to me -- thank you @joroKr21 . I think some of the backwards compatible API stuff isn't strictly necessary but I also think it is fine to leave
I had some suggestions for additional tests, but otherwise this looks very nice.
/// * `input_schema` - The Arrow schema for the input, used for determining expression data types | ||
/// when performing type coercion. | ||
/// * `codec` - An extension codec used to decode custom UDFs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for improving the comments here as well ❤️
) | ||
} | ||
|
||
// TODO: Make this the public function on next major release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what is on main will effectively become the next major release (38.0.0
) so there is no reason to avoid API changes here.
If we want to backport this to a stable 37.1.0
release (e.g. #9904 ) then we should avoid API changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see so if we want the backwards compatible changes it would be against branch-37
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the breaking changes here in 1f0040c
} | ||
|
||
#[test] | ||
fn roundtrip_distinct_count() -> Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might be worth a test for round tripping window functions with a user defined function too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in 280feda
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @joroKr21 -- thanks again
Which issue does this PR close?
Follup to #9436.
Rationale for this change
We need to use the physical extension codec consistently, so that UDFs embedded in more complex expressions can also be serialized and deserialized. Most importantly in
parse_required_physical_expr
which calls recursively intoparse_physical_expr
and should be providing the original codec.What changes are included in this PR?
Add a
PhysicalExtensionCodec
parameter to more of the deserialization functions that were using aDefaultPhysicalExtensionCodec
previously. I have left the the signature of public functions unchanged but they should be changed in the next major release (left a TODO comments):parse_physical_window_expr
parse_protobuf_hash_partitioning
parse_protobuf_file_scan_config
Are these changes tested?
I extended the test in
roundtrip_physical_plan.rs
to use a more complex expression.Are there any user-facing changes?
No, but left a TODO for future changes.